Weak rule
- Asia > Japan (0.14)
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > United States > California > San Diego County > La Jolla (0.04)
- (3 more...)
Self-Training with Weak Supervision
Giannis Karamanolakis, Subhabrata Mukherjee, Guoqing Zheng, Ahmed Hassan Awadallah
State-of-the-art deep neural networks require large-scale labeled training data that is often expensive to obtain or not available for many tasks. Weak supervision in the form of domain-specific rules has been shown to be useful in such settings to automatically generate weakly labeled training data. However, learning with weak rules is challenging due to their inherent heuristic and noisy nature. An additional challenge is rule coverage and overlap, where prior work on weak supervision only considers instances that are covered by weak rules, thus leaving valuable unlabeled data behind. In this work, we develop a weak supervision framework (ASTRA) that leverages all the available data for a given task. To this end, we leverage task-specific unlabeled data through self-training with a model (student) that considers contextualized representations and predicts pseudo-labels for instances that may not be covered by weak rules. We further develop a rule attention network (teacher) that learns how to aggregate student pseudo-labels with weak rule labels, conditioned on their fidelity and the underlying context of an instance. Finally, we construct a semi-supervised learning objective for end-to-end training with unlabeled data, domain-specific rules, and a small amount of labeled data. Extensive experiments on six benchmark datasets for text classification demonstrate the effectiveness of our approach with significant improvements over state-of-the-art baselines.
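To make the student-teacher loop concrete, here is a minimal sketch of the kind of iteration the abstract describes. Everything here is illustrative rather than the authors' implementation: the `rules` interface, the plain vote averaging (a stand-in for the learned rule-attention network), and the hard pseudo-labels are all simplifying assumptions.

```python
import numpy as np

def astra_style_step(student, rules, labeled_X, labeled_y, unlabeled_X):
    """One illustrative self-training iteration (hypothetical API).

    student: any classifier with fit / predict_proba (scikit-learn style).
    rules:   functions mapping an instance to a class index, or None
             when the rule does not cover that instance.
    """
    # 1. The student predicts soft pseudo-labels for ALL unlabeled
    #    instances, including those that no weak rule covers.
    student_probs = student.predict_proba(unlabeled_X)
    n_classes = student_probs.shape[1]

    # 2. A "teacher" aggregates rule votes with the student pseudo-label.
    #    Plain averaging here stands in for ASTRA's learned rule-attention
    #    network, which weights each source by fidelity and context.
    teacher_probs = []
    for x, s in zip(unlabeled_X, student_probs):
        votes = [s]  # the student acts as one always-firing source
        for rule in rules:
            label = rule(x)
            if label is not None:
                votes.append(np.eye(n_classes)[label])
        teacher_probs.append(np.mean(votes, axis=0))

    # 3. Retrain the student on the labeled data plus teacher pseudo-labels
    #    (hard labels here; the paper trains end to end on soft targets).
    pseudo_y = np.asarray(teacher_probs).argmax(axis=1)
    student.fit(np.concatenate([labeled_X, unlabeled_X]),
                np.concatenate([labeled_y, pseudo_y]))
    return student
```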
Faster Boosting with Smaller Memory
Julaiti Alafate, Yoav Freund
The two state-of-the-art implementations of boosted trees, XGBoost and LightGBM, can process large training sets extremely fast. However, this performance requires a memory size sufficient to hold 2-3 times the training set. This paper presents an alternative approach to implementing boosted trees, which achieves a significant speedup over XGBoost and LightGBM, especially when memory is small. This is achieved using a combination of two techniques, early stopping and stratified sampling, which are explained and analyzed in the paper. We describe our implementation and present experimental results to support our claims.
- Asia > Japan (0.14)
- North America > United States > California > San Diego County > San Diego (0.04)
- Oceania > New Zealand > North Island > Waikato (0.04)
- (2 more...)
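The toy sketch below illustrates the two ingredients named in the abstract above, heavily simplified: each boosting round fits the weak learner on a small weight-proportional subsample (a crude stand-in for the paper's stratified sampling), and a round that yields no better-than-chance learner terminates training (a far cruder proxy for the paper's statistically grounded early stopping). All names and hyperparameters are assumptions for illustration, not the authors' code.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boost_with_sampling(X, y, n_rounds=50, sample_size=1000, seed=0):
    """AdaBoost-style boosting (labels in {-1, +1}) that fits each weak
    learner on a small weighted sample instead of the full training set."""
    rng = np.random.default_rng(seed)
    n = len(X)
    w = np.full(n, 1.0 / n)                  # example weights
    learners, alphas = [], []
    for _ in range(n_rounds):
        # Sample a small working set with probability proportional to
        # weight, so the weak learner never scans the whole dataset.
        idx = rng.choice(n, size=min(sample_size, n), p=w / w.sum())
        stump = DecisionTreeClassifier(max_depth=1).fit(X[idx], y[idx])
        pred = stump.predict(X)
        err = np.average(pred != y, weights=w)
        if err >= 0.5:                       # no better than chance:
            break                            # crude early stopping
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))
        w *= np.exp(-alpha * y * pred)       # up-weight the mistakes
        w /= w.sum()
        learners.append(stump)
        alphas.append(alpha)
    return learners, alphas

def boosted_predict(learners, alphas, X):
    return np.sign(sum(a * h.predict(X) for a, h in zip(alphas, learners)))
```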
Tell Me Something New: A New Framework for Asynchronous Parallel Learning
Julaiti Alafate, Yoav Freund
We present a novel approach for parallel computation in the context of machine learning that we call "Tell Me Something New" (TMSN). This approach involves a set of independent workers that use broadcast to update each other when they observe "something new". TMSN does not require synchronization or a head node and is highly resilient against failing machines or laggards. We demonstrate the utility of TMSN by applying it to learning boosted trees. We show that our implementation is 10 times faster than XGBoost and LightGBM on the splice-site prediction problem.
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (2 more...)
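As a thumbnail of the communication pattern described in the abstract above, here is a toy TMSN-style loop using Python threads and queues. It is not the authors' system: the "model" is just a loss value, and the broadcast is a put into each peer's queue. What it does show is the key property the abstract claims: no head node, no synchronization barrier, and a slow worker never blocks the others.

```python
import queue
import random
import threading
import time

def tmsn_worker(worker_id, inboxes, stop):
    """One TMSN-style worker (toy sketch): improve a model locally and
    broadcast to every peer only upon finding something strictly better."""
    best_loss = 1.0
    rng = random.Random(worker_id)
    while not stop.is_set():
        # Absorb any "news" broadcast by other workers; a laggard here
        # never blocks anyone, since reads are non-blocking.
        try:
            while True:
                best_loss = min(best_loss, inboxes[worker_id].get_nowait())
        except queue.Empty:
            pass
        # Local work: stand-in for scanning data / growing boosted trees.
        candidate = rng.uniform(0.0, 1.0)
        time.sleep(0.01)
        if candidate < best_loss:            # observed "something new"
            best_loss = candidate
            for i, box in enumerate(inboxes):
                if i != worker_id:           # broadcast to all peers
                    box.put(candidate)

if __name__ == "__main__":
    n_workers = 4
    inboxes = [queue.Queue() for _ in range(n_workers)]
    stop = threading.Event()
    threads = [threading.Thread(target=tmsn_worker, args=(i, inboxes, stop))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    time.sleep(1.0)
    stop.set()
    for t in threads:
        t.join()
```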
The Boosting Approach to Machine Learning
Boosting is an ensemble technique that attempts to create a strong classifier from a number of weak classifiers. It is one of the most powerful techniques for building predictive models and can improve both the accuracy and the robustness of a model. Whereas some ensemble methods combine hundreds or thousands of independently trained models of the same algorithm, boosting builds its ensemble sequentially: a model is built from the training data, then a second model is added that attempts to correct the errors of the first. Models are added until the training set is predicted perfectly or a maximum number of models is reached.
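The "each new model corrects the previous one" idea is easiest to see in a toy gradient-boosting sketch for regression, where every tree is fit to the residual errors of the ensemble built so far. The scikit-learn trees, depth, and learning rate below are arbitrary illustrative choices, not a reference implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost(X, y, n_rounds=100, lr=0.1):
    """Toy gradient boosting: each new tree is fit to the residuals
    (the errors) of the ensemble built so far."""
    pred = np.full(len(y), y.mean())      # model 0: a constant
    trees = []
    for _ in range(n_rounds):
        residual = y - pred               # what the ensemble gets wrong
        tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
        pred += lr * tree.predict(X)      # the new tree corrects the errors
        trees.append(tree)
        if np.allclose(residual, 0):      # training set predicted perfectly
            break
    return y.mean(), trees

def predict(base, trees, X, lr=0.1):
    return base + lr * sum(t.predict(X) for t in trees)
```

AdaBoost realizes the same sequential error-correcting recipe differently: instead of fitting residuals, it reweights the training examples so that later models focus on the instances earlier models misclassified.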
Quick Introduction to Boosting Algorithms in Machine Learning
Many analysts misinterpret the term 'boosting' as it is used in data science, so let me provide an intuitive explanation of the term. Boosting gives machine learning models the power to improve their prediction accuracy. Boosting algorithms are among the most widely used algorithms in data science competitions; the winners of our last hackathons agree that they try boosting algorithms to improve the accuracy of their models.